2024-10-28 14:42:03.AIbase.
Meta Open Sources Long Video LLM Project LongVU: Filters Duplicate Frames for Efficient and Accurate Understanding of Long Video Content
2024-10-08 11:18:05.AIbase.
Apple Introduces MM1.5: A Revolution in Multimodal AI Models Redefining Intelligent Understanding?
2024-09-14 15:42:34.AIbase.
Apple Aims to Leverage the UI-JEPA Model to Understand User Intent on Devices
2024-09-02 11:17:38.AIbase.
NVIDIA Launches New Vision-Language Model NVEagle, Capable of Chatting About Images
2024-08-14 14:05:00.AIbase.
Tencent Launches First Open Source Multimodal Large Language Model VITA for Seamless Communication with Users
2024-07-01 09:14:16.AIbase.